ABSTRACT The human zona pellucida, composed of 
three glycoproteins (ZP1, ZP2, and ZP3), forms an extracel- 
lular matrix that surrounds ovulated eggs and mediates species- 
specific fertilization. The genes that code for at least two of the 
zona proteins (ZP2 and ZP3) cross-hybridize with other mam- 
malian DNA. The recently characterized mouse sperm receptor 
gene (Zp-3) was used to isolate its human homolog. The human 
homolog spans ≈18.3 kilobase pairs (kbp) (compared to 8.6 
kbp for the mouse gene) and contains eight exons, the sizes of 
which are strictly conserved between the two species. Four 
short (8-15 bp) sequences within the first 250 bp of the 5' 
flanking region in the human Zp-3 homolog are also present 
upstream of mouse Zp-3. These elements may modulate oocyte- 
specific gene expression. By using the polymerase chain reac- 
tion, a full-length cDNA of human ZP3 was isolated from 
human ovarian poly(A)+ RNA and used to deduce the structure 
of human ZP3 mRNA. Certain features of the human and 
mouse ZP3 transcripts are conserved. Both have unusually 
short 5' and 3' untranslated regions, both contain a single open 
reading frame that is 74% identical, and both code for 424 
amino acid polypeptides that are 67% the same. The similarity 
between the two proteins may define domains that are impor- 
tant in maintaining the structural integrity of the zona pellu- 
cida, while the differences may play a role in mediating the 
species-specific events of mammalian fertilization. 

Many details of mammalian fertilization have been elucidated 
from studies in the mouse. Fertilization begins when a 
capacitated mouse spermatozoon attaches to the zona pel- 
lucida, an extracellular matrix that surrounds the ovulated 
egg of mammals. After tight binding to the zona, the
dehiscence of the sperm acrosome results in the release of lytic
enzymes that, coupled with the forward motility of the
sperm, result in passage through the zona pellucida. The
sperm then traverses the perivitelline space, fuses with the 
egg's plasma membrane, and is absorbed into the cytoplasm, 
where its nucleus undergoes decondensation and repackag- 
ing into the male pronucleus of the one-cell zygote (1). 

The sperm-egg interaction that takes place in the oviduct 
after ovulation of the egg and insemination of sperm is 
relatively species-specific (2-5). In part, this specificity is 
dependent on the affinity of the spermatozoon for the zona 
pellucida of the homologous species (for review, see ref. 6). 
The mouse zona pellucida is composed of three sulfated 
glycoproteins, ZP1, ZP2, and ZP3, and specific functions 
have been ascribed to each. ZP3 induces the sperm acrosome 
reaction and mediates the initial binding of sperm to the egg 
via O-linked oligosaccharide side chains. ZP2 acts as a 
secondary sperm receptor and it, along with ZP3, is biochem- 
ically modified after fertilization to provide the postfertiliza- 
tion block to polyspermy. ZP2 and ZP3 exist as dimers in long 
filaments that appear to be cross-linked by ZP1 (for review, 
see ref. 7). The recent cloning of mouse ZP2 and ZP3 cDNAs
(8-10) and the characterization of their genomic loci (8, 11,
12) have provided a wealth of molecular detail on the primary
structure of the zona proteins and their developmentally
regulated expression during oogenesis (8, 13).

Less is known about the primary structure and the
functions of the zona pellucida proteins from other species,
including humans. The human zona pellucida is composed
of three glycoproteins, ZP1 (90-110 kDa), ZP2 (64-76
kDa), and ZP3 (57-73 kDa) (14, 15), at least one of which 
appears to be modified following fertilization (15). However, 
more detailed biochemical studies of individual human zona 
pellucida proteins are difficult because of the paucity of 
biological material. An alternate approach to learning more 
about the human zona pellucida is to deduce the structure of 
individual proteins from their cognate genes. We have dem- 
onstrated (9, 10) that the genes that code for the zona 
pellucida proteins are conserved among mammals. Taking 
advantage of the cross-hybridization of a mouse cDNA with 
human DNA, we now report the isolation of full-length 
cDNA clones of human ZP3* and the characterization of the
genomic locus of human ZP3, the homolog of the mouse
sperm receptor.

MATERIALS AND METHODS 

Human Genomic DNA. A human genomic library in Charon 
4A (16) was screened with pZP3.2 (10). A recombinant, 
λHuZP3.14, was isolated and plaque-purified. The
14.0-kilobase-pair (kbp) insert (containing exons 1-5) was digested
with BamHI and EcoRI, the resultant fragments were
subcloned into pBluescript, and the nucleic acid sequence of 11.7
kbp was determined (17). 

The 3' end of the gene including exons 6-8 was obtained 
by the polymerase chain reaction (PCR) (GeneAmp; Perkin- 
Elmer/Cetus). One microgram of human genomic DNA from 
K562 cells (gift of A. Dean) was primed with oligonucleotides
H78 (exon 6) and H77 (exon 8) for 25 cycles (94°C for 1 min,
60°C for 2 min, and 72°C for 3 min) and a final extension at 72°C
for 15 min. The 1.8-kbp fragment was blunt end-ligated into
pBluescript (Stratagene) and sequenced (17). 

The 3' flanking sequences were determined by PCR of a 
circularized Xba I fragment encompassing exon 8 (Fig. 1). 
One microgram of human genomic DNA was digested with 
Xba I, extracted, precipitated with ethanol, resuspended (1
μg/ml), and circularized overnight with 10 units of T4 ligase. The
Xba I fragment containing exon 8 and the 3' flanking region
was amplified by using exon 8 oligonucleotide primers H101
and H86 for 25 cycles of PCR. The resultant 570-bp fragment
was blunt end-ligated into pBluescript and sequenced (17). 

Southern Blot Analysis. Exon-specific DNA probes were
amplified by PCR with human cDNA as a substrate (see
below) and the following oligonucleotide primer sets: H36
and H75 (exons 4 and 5); H78 and H81 (exons 6 and 7); and
H80 and H77 (exon 8). Labeled probes (18) were used in
Southern blots of human K562 DNA after digestion with a
variety of restriction endonucleases as described (12).

Abbreviations: PCR, polymerase chain reaction; nt, nucleotides.
*The sequence reported in this paper has been deposited in the
GenBank database (accession no. M35109).

Human ZP3 cDNA. Poly(A)+ RNA (19) was purified from
total RNA (20) isolated from a human ovary (supplied by the
National Disease Research Interchange) and used as a
template for first-strand synthesis with oligonucleotide primer
A2T15 according to the manufacturer's instructions (Promega
Riboclone cDNA synthesis system). The first strand was
amplified by PCR with A2T15 and oligonucleotide H36 from
human exon 4 (94°C for 5 min, 50°C for 2 min, and 72°C for
40 min, followed by 40 cycles of 94°C for 2 min, 58°C for 2
min, and 72°C for 3 min) (21). The resultant 800-bp band was
reamplified with A2 [lacking poly(dT)] and H74 from exon 5
(25 cycles of 94°C for 1 min, 55°C for 2 min, and 72°C for 3
min and a final extension at 72°C for 15 min). The 670-bp
insert was blunt-end ligated into pBluescript and sequenced
(17). The full-length cDNA was prepared by using
oligonucleotide primers H87 (exon 1) and H88 (exon 8) and 25 cycles
of PCR (94°C for 1 min, 60°C for 2 min, and 72°C for 3 min
and a final extension at 72°C for 15 min).

Oligonucleotide Primers. Oligonucleotide primers were
synthesized on an Applied Biosystems DNA Synthesizer
(model 380B). Oligonucleotide primers (5' to 3') hybridizing
to the coding strand include: A2 (CTCGAGAAGCTTGTCGAC),
A2T15 [CTCGAGAAGCTTGTCGAC(T)15],
H77 (AAGCAGACACAGGGTGG),
H81 (GCCTGCGGTTACGGGAA),
H87 (AGATCTGAGCTCATTGCTTTCTTCTTTTATTCGGAAG),
and H101 (GGAAGATCAGTGGCCCC).
Oligonucleotide primers (5' to 3') hybridizing to the
noncoding strand include: H36 (TGCAGCCCACCTCCAGG),
H74 (CTGTCTTGTCGACGGTC), H78 (TCACCTGCCACCTGAAG),
H80 (CAGAAGAAGCAGATGTC), H86 (CTGTGGTGGTGTCCCTG),
and H88 (TGCAGGGTACCATGGAGCTGAGCTATAGGC).
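
As a quick check on the primer list, the following Python sketch computes the length, GC fraction, and reverse complement of several of the primers above. The sequences are transcribed from the list as continuous strings. The helper names (`reverse_complement`, `gc_fraction`, `wallace_tm`) are hypothetical, and the Wallace-rule melting temperature is only a rough rule of thumb for short oligonucleotides; it is an assumption added for illustration and is not part of the methods described here.

```python
# Primer sequences (5' to 3') transcribed from the list above.
PRIMERS = {
    "A2": "CTCGAGAAGCTTGTCGAC",
    "H77": "AAGCAGACACAGGGTGG",
    "H81": "GCCTGCGGTTACGGGAA",
    "H101": "GGAAGATCAGTGGCCCC",
    "H36": "TGCAGCCCACCTCCAGG",
    "H74": "CTGTCTTGTCGACGGTC",
    "H78": "TCACCTGCCACCTGAAG",
    "H80": "CAGAAGAAGCAGATGTC",
    "H86": "CTGTGGTGGTGTCCCTG",
}
# A2T15 is the A2 adaptor followed by 15 T residues.
PRIMERS["A2T15"] = PRIMERS["A2"] + "T" * 15

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C residues in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C: 2*(A+T) + 4*(G+C).

    Assumption: rule-of-thumb estimate only, not a value from the paper.
    """
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name:6s} len={len(seq):2d} GC={gc_fraction(seq):.2f} "
              f"Tm~{wallace_tm(seq)} revcomp={reverse_complement(seq)}")
```

The reverse complement is useful for locating a noncoding-strand primer on the coding-strand sequence of the gene.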

Computer-Aided Sequence Analysis. Sequence data
collection and comparison of nucleic acid sequences were
performed with Microgenie Software (Beckman, version 6).
Comparison of the human and mouse polypeptides allowed
for conservative amino acid substitutions (22). Secondary
structure of the deduced amino acid sequence of human ZP3
was determined (23), as were its hydropathicity (24) and
hydrophilicity (25). Signal peptide cleavage sites were
defined by a sliding window-weighted matrix algorithm (26).
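
The hydropathicity analysis (24) amounts to a sliding-window average of per-residue hydropathy values. A minimal Python sketch using the published Kyte-Doolittle scale is given below. The window length of 9 and the toy peptide are assumptions chosen for illustration, since the window used in this study is not stated, and the function name `hydropathy_profile` is hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i+1 .. i+window (1-based).
    The default window of 9 is an assumption, not a value from the paper.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    toy_peptide = "MKTLLVLAVCLAAGSQDE"  # hypothetical sequence, not ZP3
    for start, value in enumerate(hydropathy_profile(toy_peptide), start=1):
        print(f"{start:3d} {value:6.2f}")
```

Positive values mark hydrophobic stretches and negative values mark hydrophilic ones, so two homologous proteins can be compared by plotting their profiles on the same axis.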

RESULTS 

Genomic Locus of Human Zp-3 Homolog. A clone contain- 
ing a 14.0-kbp insert was isolated from a human genomic 
library (16) screened with a mouse ZP3 cDNA probe (10). Of 
the 14.0-kbp insert, 11.7 kbp were sequenced and contained 
the first five exons of the human Zp-3 homolog as well as ≈1.1 
kbp of the 5' flanking region and 1.78 kbp of intron 5 (27). 
Despite repeated screening of the library with probes from 
mouse cDNA specific for exons 6-8, a genomic fragment 
containing the balance of the human exons was not isolated. 

Instead, the remaining exons of the human Zp-3 homolog 
were characterized by amplifying genomic DNA by the PCR 
with oligonucleotides from exons 6 and 8 (based on the cDNA 
sequence). The resultant 1.8-kbp fragment containing exons
6-8 and introns 6 and 7 was sequenced. The size of the remaining
intron, 5, was determined as follows. Radiolabeled probes 
corresponding to exons 4, 5, 6, and 7 all hybridized to a single 
10.5-kbp Xba I fragment from human genomic DNA (an exon 
8 probe detected a 0.6-kbp fragment). Sequence analysis 
showed that exon 5 was 1521 bp 3' of an Xba I site and that
exon 6 was 1460 bp 5' of an Xba I site in intron 7. Thus, the
limits of intron 5 can be set at 7.5 kbp ± 10%.
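
The intron 5 estimate follows from simple subtraction, sketched here in Python. The text does not say which edge of each exon the two distances were measured to, so the sketch brackets the answer: the larger value assumes both distances run to the exon edge facing intron 5, and the smaller value assumes they run to the far edge, in which case the exon 5 and exon 6 lengths from Table 1 are subtracted as well. All names are hypothetical.

```python
XBA_FRAGMENT_BP = 10_500      # Xba I fragment detected by the exon 4-7 probes
XBA_TO_EXON5_BP = 1_521       # exon 5 lies this far 3' of the upstream Xba I site
EXON6_TO_XBA_BP = 1_460       # exon 6 lies this far 5' of the Xba I site in intron 7
EXON5_BP, EXON6_BP = 118, 92  # exon sizes from Table 1


def intron5_estimates() -> tuple[int, int]:
    """Return (smaller, larger) estimates of intron 5 length in bp.

    Assumption: the two cases differ only in whether the quoted distances
    include the lengths of exons 5 and 6.
    """
    larger = XBA_FRAGMENT_BP - XBA_TO_EXON5_BP - EXON6_TO_XBA_BP
    smaller = larger - EXON5_BP - EXON6_BP
    return smaller, larger


def within_reported_limits(size_bp: int, center: int = 7_500,
                           tolerance: float = 0.10) -> bool:
    """True if size_bp lies inside center +/- tolerance (7.5 kbp +/- 10%)."""
    return center * (1 - tolerance) <= size_bp <= center * (1 + tolerance)


if __name__ == "__main__":
    for estimate in intron5_estimates():
        print(estimate, within_reported_limits(estimate))
```

Under either reading the estimate (7309 or 7519 bp) falls inside the stated limits of 7.5 kbp ± 10%.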

The resultant exon map of the human Zp-3 homolog (Fig.
1A) is similar to that of mouse Zp-3 (Fig. 1B) (11, 12). The sizes
of the human exons 1-8 range from 92 to >312 bp (Table 1):
exons 2, 3, 5, 6, and 7 are identical in length to those of the
mouse gene, whereas exon 4 is 6 bp smaller and exon 8 is 4
bp larger than the corresponding mouse exons. The sizes of
the introns in the human gene (130 bp to ≈7.5 kb) vary
considerably from those in mouse Zp-3, although the
clustering of exons 3-5 and 6-8 appears to be similar in both (Fig.
1 and Table 1). Overall, the 18.3-kbp transcription unit of the
human Zp-3 homolog is more than twice the 8.6-kbp size of 
the mouse gene (11, 12). 
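
The exon and intron sizes can be cross-checked against the lengths quoted in the text. The Python sketch below sums them, taking the lower bound of 312 bp for exon 1 and the 7.5-kbp midpoint for intron 5. Both are assumptions, since only a bound and a ±10% range are reported, and the variable names are hypothetical.

```python
# Exon sizes in bp; exon 1 is entered at its reported lower bound (>312).
EXONS_BP = {1: 312, 2: 119, 3: 104, 4: 178, 5: 118, 6: 92, 7: 137, 8: 229}

# Intron sizes in bp, keyed by the exon they follow; intron 5 is the midpoint
# of the reported 7.5 kbp +/- 10% estimate.
INTRONS_BP = {1: 4260, 2: 2813, 3: 449, 4: 389, 5: 7500, 6: 130, 7: 1234}
INTRON5_UNCERTAINTY_BP = 750  # 10% of 7500


def exon_total() -> int:
    return sum(EXONS_BP.values())


def locus_span() -> tuple[int, int, int]:
    """Return (low, mid, high) transcription-unit length in bp."""
    mid = exon_total() + sum(INTRONS_BP.values())
    return (mid - INTRON5_UNCERTAINTY_BP, mid, mid + INTRON5_UNCERTAINTY_BP)


if __name__ == "__main__":
    print("exons:", exon_total())
    print("locus (low, mid, high):", locus_span())
```

With these inputs the exons sum to 1289 bp, the length given for the mRNA in Fig. 3, and the locus spans roughly 17.3-18.8 kbp across the intron 5 uncertainty, a range that includes the ≈18.3 kbp quoted above.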

The exact determination of the transcription start site was
precluded by the paucity of human ovarian RNA. However,
there is a TATAA box 43 bp upstream of the ATG (compared
to 58 bp in the mouse gene) that makes it likely that the 5'
untranslated region is similar in length to the 29 nucleotides
(nt) of the 5' untranslated region of the mouse ZP3 mRNA.
The 3' end of exon 8 was determined from the nucleic acid
sequence of two separate amplified isolates of human cDNAs
(see below) and two separate amplified isolates of human
genomic DNA.

Table 1. Exon sizes, intron lengths, and splice-junction sequences of human Zp-3 homolog

Number  Exon size, bp  Splice-junction sequences                     Intron length, bp
1       >312                         . . Exon 1 . . GTAAGAGAGGCT     4260
2       119            CCTTTCCTCCAG  . . Exon 2 . . GTCGGTGTGGGA     2813
3       104            TTCTTCTCTCAG  . . Exon 3 . . GTAAGAGAAGAA     449
4       178            TGTGTCTTTCAG  . . Exon 4 . . GTCAGCACTGGG     389
5       118            TTTTCTTCCAG   . . Exon 5 . . GTAAGAGCTTTA     7500 ± 10%
6       92             Not defined   . . Exon 6 . . GTGAGGACAGGT     130
7       137            CTTTCATCTCAG  . . Exon 7 . . GTATGTCACAGA     1234
8       229            CTCCTTTCACAG  . . Exon 8

The AG ending each acceptor sequence and the GT beginning each donor
sequence are the splice acceptor and donor consensus sequences.

Sequence Environment of Human Zp-3 Homolog. Because 
of the inexactitude of the transcription start site of the human 
gene, the 5' flanking sequences of the mouse Zp-3 gene and 
the human homolog were aligned on their TATAA boxes; in 
the mouse, the TATAA box is 30 bp upstream of the 
transcription start site (Fig. 2). The human Zp-3 homolog
contains two Alu repeats (28) inverted one to the other (-254 to
-608; -784 to -1111). The right-hand monomer of the repeat
closest to the coding region of the gene corresponds to the
homologous B1 repeat (29) found upstream of mouse Zp-3
(12). Although Alu repeats are present in the genome on average
once every 5-10 kbp (28), there are at least 16 Alu repeat
sequences at the locus of the human Zp-3 homolog
(Fig. 1). The tandem repeat found upstream of the
transcription start site of mouse Zp-3 and the perfect 12-bp repeat
found 3' to the gene (12) are not present in the human gene. 

There is little sequence similarity in the immediate 5' 
flanking regions of the mouse Zp-3 gene and its human 
homolog, and neither contains a canonical CCAAT box. 
However, within the first 250 bp of the human homolog and 
the first 200 bp of 5' flanking sequence of the mouse Zp-3 gene 
there are four elements that range in length from 8 to 15 bp 
and are 82-90% identical (Fig. 2). Elements 1 and 2 are very 
similarly positioned and, although elements 3 and 4 are 
located more 5' in the human than in the mouse sequence, the 
distances between the two are virtually identical (34 and 35 
bp, respectively). 

Deduced Structure of Human ZP3 mRNA and Protein. 
Several cDNAs were obtained as PCR products using 
poly(A)+ RNA isolated from human ovarian tissue as a
substrate and an oligo(dT) adaptor primer in conjunction with
oligonucleotide primers from exons 1, 5, or 6 of the human
Zp-3 homolog. Three independent amplified products were
sequenced in their entirety. The data that corresponded to the
genomic sequence (see above) were used to deduce the
structure of human ZP3 mRNA. 

The human ZP3 mRNA, devoid of its poly(A) tail, is ≈1.3 
kb long with a single open reading frame of 1272 nt (Fig. 3) 
and has a nucleic acid sequence 74% identical to that of 
mouse ZP3. The initiator ATG is located in the consensus 
sequence ANNATG (30, 31), and the TAA stop codon is part 
of the polyadenylylation signal (AATAAA) that precedes the 
start of the poly(A) tail by 17 nt. The unusually short 5' and 
3' untranslated regions are similar to those of mouse ZP3 
mRNA (10) and of mouse ZP2 mRNA (8). 

The single open reading frame encodes a 424-amino acid 
protein (12% acidic, 8% basic, 7% aromatic, and 32% hydro- 
phobic residues) that is identical in length with the mouse ZP3 
protein and has a calculated molecular mass of 47,032 Da 
(Fig. 3). Overall, 67% of the amino acids are identical 
between the human and mouse ZP3 proteins, with the
greatest similarity in the center of the protein (Fig. 3). The human
polypeptide chain contains four potential N-linked
glycosylation sites, three of which are conserved in the mouse
protein, including one that has been shown to be derivatized
(9, 10). There are 66 potential O-linked glycosylation sites
(threonines or serines), 71% of which are conserved in the
mouse. The zona proteins are secreted, and, using the sliding
window/matrix scoring method of von Heijne (26), we have
identified a potential peptidase cut site after the 22nd amino
acid that would result in an N-terminal glutamine, as we have
proposed for the mouse ZP3 protein (10). The secreted
protein would have a core molecular mass of 44,399 Da. 
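
Potential glycosylation sites of the kind counted above can be located with a simple pattern search. The Python sketch below finds Asn-Xaa-(Thr or Ser) triplets, the definition used in the Fig. 3 legend, and counts serines and threonines as candidate O-linked sites. The function names and the toy peptide are hypothetical. Many prediction tools also exclude proline at the Xaa position; that refinement is not stated in this paper and is therefore not applied here.

```python
import re

# Lookahead so that overlapping triplets (e.g. N-N-T followed by N-T-S)
# are all reported.
_SEQUON = re.compile(r"(?=(N[A-Z][ST]))")


def find_n_glycosylation_sites(protein: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, triplet) for each Asn-Xaa-(Ser/Thr)."""
    return [(m.start() + 1, m.group(1)) for m in _SEQUON.finditer(protein)]


def count_o_glycosylation_candidates(protein: str) -> int:
    """Count serine and threonine residues (potential O-linked sites)."""
    return protein.count("S") + protein.count("T")


if __name__ == "__main__":
    toy_peptide = "MNASNNTSAK"  # hypothetical sequence, not ZP3
    print(find_n_glycosylation_sites(toy_peptide))
    print(count_o_glycosylation_candidates(toy_peptide))
```

Running the same search on two aligned sequences and comparing the reported positions gives the number of sites conserved between species.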

Using the primary amino acid sequence, we compared the 
predicted secondary structure of the human and mouse 
proteins (Fig. 4). The hydropathicity profiles (24) of the two 
proteins (Fig. 4A) are strikingly similar and represent the 
conservative nature of the allowed amino acid substitutions. 
The same broad hydrophobic region near the carboxyl ter- 
minus observed in mouse ZP2 and ZP3 (8, 10) is also present 
in human ZP3. The greatest differences in hydropathicity
occur near the N terminus (Fig. 4A) and coincide with a region of
the proteins where the amino acid sequence is dissimilar
(<40% identity). Predicted α-helical structure (23) is also well
conserved (Fig. 4B), again reflecting the similarity between
the two amino acid sequences. Both of the proteins are
known to contain disulfide bonds, and interestingly, all 13
cysteine residues found in the secreted human ZP3
polypeptide are conserved in the mouse protein, suggesting a role in
preserving the secondary structure of these structural
proteins.

Fig. 2. Conservation of 5' flanking DNA sequences of mouse
Zp-3 and the human homolog. (A) Schematic representation of 1100
bp upstream of the transcription start site with the sequence of the
human Zp-3 homolog aligned on the TATAA box of the previously
reported mouse gene (12). Numbered circles represent conserved
elements (1-4) defined in B. Boxes represent blocks of repeated
sequences found in the Zp-3 loci. (B) DNA elements in human and
mouse conserved in sequence and position upstream of the Zp-3
gene. Shaded areas indicate nucleic acid identities.

Fig. 3. Structure of human ZP3 mRNA and protein. The first line is the nucleic acid sequence of human ZP3 mRNA containing 1289 nt
determined from human cDNA and genomic sequences. The initiation and stop codons are boxed, and the polyadenylylation signal is overlined.
The single 1272-nt open reading frame is translated into a 424-amino acid peptide in the second line and aligned in the third line with the 424
amino acids of mouse ZP3 protein (10). Identical amino acid residues in mouse and human ZP3 are shaded; conserved changes (22) are enclosed
in boxes with dotted lines. The putative 22-amino acid signal peptide is indicated by a wavy line, and the arrow points to the proposed signal
peptidase cut site. The four potential N-linked glycosylation sites [Asn-Xaa-(Thr or Ser)] of human ZP3 are bracketed from above, and the six
potential sites of mouse ZP3 are bracketed from below. Three of the sites are conserved between the two species.

DISCUSSION 

The single copy human ZP3 gene is composed of eight exons 
in a transcription unit of 18.3 kb. The exons, ranging in size 
from 92 to at least 312 bp, are almost identical in size to the 
eight exons of mouse Zp-3, and the nucleic acid sequence of 
the coding regions is 74% identical. We isolated a full-length
cDNA from human ovarian poly(A)+ RNA even though the small
amount of biological tissue available to us precluded the
detection of ZP3 transcripts by RNA blot-hybridization
(Northern) analysis, despite prolonged exposure times. In the
mouse, Zp-3 is transcribed uniquely in oocytes during a
narrow 2-week growth phase prior to meiotic maturation and
ovulation (8, 13). Presumably, the temporal and spatial
specificity of this expression is modulated by transcriptional
factors (32) that interact with the 5' flanking regions of the
zona genes. Binding sites for such factors may be conserved
among mammals, and four similar elements (8-15 bp, 82-90%
identical; see Fig. 2) are present at approximately the same
location in the first 250 bp upstream of the mouse Zp-3 gene
and the human homolog. These sequences are distinct from
cis-acting elements previously reported in the literature.
Provocatively, these four elements are also present upstream
of the transcription start site of the mouse Zp-2 gene (8, 27).
Whether or not these DNA sequences have a functional role
in oocyte-specific gene expression remains to be determined.

Fig. 4. Comparison of the secondary structure of human and
mouse ZP3 proteins. (A) Hydropathicity of human and mouse ZP3
determined by the Kyte and Doolittle algorithm (24), indicating the
degree of conservation between the two proteins. (B) α-Helical
structure of human and mouse ZP3 as determined by Garnier et al.
(23).

The human ZP3 transcript is remarkably similar to the 
mouse ZP3 mRNA. Both have short 5' and 3' untranslated 
regions and both have a single open reading frame of 1272 nt 
that encodes a 424-amino acid protein. Although the molec- 
ular size of the polyadenylylated human transcript is not 
known, the sizes of three other ZP3 transcripts (mouse, rat,
and rabbit) are indistinguishable from one another (10), which
suggests that these motifs may be common to all mammalian 
ZP3 mRNAs. The deduced amino acid sequences of the 
human and mouse ZP3 protein are 67% identical, and, as 
evidenced by similarity of their predicted secondary struc- 
ture, many substitutions are conservative. Both the con- 
served central regions and the carboxyl hydrophobic domain 
of the human and mouse ZP3 may be important for
interactions with ZP2 and ZP1, the two other zona proteins of
the mouse and human zonae pellucidae.

The similarities of the ZP3 proteins among species may be 
important in providing structural integrity to the extracellular 
zona pellucida and in positioning protein and carbohydrate 
moieties so that they can participate in sperm-egg interac- 
tions. If, as is the case in the mouse, human ZP3 functions as 
the sperm receptor in vivo, the regions of maximal differences 
may play a role in mediating high-affinity interactions that lead 
to fertilization for a particular species. Of the four regions of 
the human ZP3 protein that are most dissimilar (<40%) from 
the mouse protein (Fig. 3, amino acids 32-40, 93-102, 323- 
341, and 369-378), the two most carboxyl regions correspond 
to hydrophilic peaks (25) that presumably lie on the surface 
of the protein. One includes amino acids 338-342, which
correspond to the binding site on mouse ZP3 (33) of an
anti-mouse ZP3 monoclonal antibody (34), further
supporting a surface position for this domain. The cloning of two
sperm receptors, each in a species for which in vitro
fertilization is possible, may provide the necessary reagents with
which to gain additional insights into those protein and
carbohydrate domains that are critical for species-specific
fertilization.
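
Regions of low similarity such as those listed above can be located by scanning two aligned sequences with a fixed window. The Python sketch below is one way to do this, assuming an ungapped, equal-length alignment; the window length of 10, the 40% threshold as a default argument, the function names, and the toy sequences are all assumptions for illustration and do not reproduce the analysis performed in this study.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def low_identity_windows(a: str, b: str, window: int = 10,
                         threshold: float = 40.0) -> list[tuple[int, int]]:
    """Return 1-based (start, end) of windows with identity below threshold."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, i + window)
        for i in range(len(a) - window + 1)
        if percent_identity(a[i:i + window], b[i:i + window]) < threshold
    ]


if __name__ == "__main__":
    seq_a = "AAAAAGGGGG"  # hypothetical aligned sequences, not ZP3
    seq_b = "AAAAATTTTT"
    print(percent_identity(seq_a, seq_b))
    print(low_identity_windows(seq_a, seq_b, window=5))
```

Overlapping windows that fall below the threshold can then be merged into contiguous regions for comparison with the hydrophilicity profile.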

We appreciate the critical reading of the manuscript by Drs. 
Graeme Wistow and Michael O'Rand. The human ovarian tissue was 
kindly supplied by the National Disease Research Interchange, 
Philadelphia. 